Climatic events have an impact on the population health and economic consequences worldwide.
Using the National Oceanic and Atmospheric Administration’s (NOAA) Storm Database we can measure the impact of most weather harmful events on US.
I concluded the most impact events are Tornado and flood.
I used for this investigation the following:
| Software | Value |
|---|---|
| platform | x86_64-pc-linux-gnu |
| arch | x86_64 |
| os | linux-gnu |
| language | R |
| version | R version 4.0.3 (2020-10-10) |
The support packages are:
library(tidyverse)
library(R.utils)
library(data.table)
library(DT)
library(knitr)
library(ggplot2)
library(plotly)
options(scipen=999)
First, I am Read the data for that I need:
Fread can make the Uncompress and read at same time.
StormData <- fread("Data_2/repdata_data_StormData.csv.bz2")
The table below have first 20 rows of raw data (you can play with the table, this is a dynamic table).
datatable(StormData[1:20,])
The columns PROPDMGEXP and CROPDMGEXP need some transformation. This because we have letters inside. Please check below table for more details.
datatable(as.data.frame(table(StormData$PROPDMGEXP)))
So, I need replace the letters by the right values for that I created a function called Economic_Losses that will receive a letter or number and will return like this:
, return 0
-, return 0
?, return 0
+, return 1
h, return 100
k, return 1000
m, return 1000000
b, return 1000000000
Other value, same value
tmp<-StormData
Economic_Losses <- function(variable) {
if (variable=="") {
return(0)
}
switch(variable,
"-"=return(0),
"?"=return(0),
"+"=return(1),
"h"=return(100),
"k"=return(1000),
"m"=return(1000000),
"b"=return(1000000000),
return(variable)
)
}
tmp$PROPDMGEXP<-as.character(tolower(tmp$PROPDMGEXP))
tmp$PROPDMGEXP<-as.numeric(lapply(tmp$PROPDMGEXP,Economic_Losses))
tmp$CROPDMGEXP<-as.character(tolower(tmp$CROPDMGEXP))
tmp$CROPDMGEXP<-as.numeric(lapply(tmp$CROPDMGEXP,Economic_Losses))
tmp<-tmp %>%
mutate(PROPDMG_T=PROPDMG*PROPDMGEXP) %>%
mutate(CROPDMG_T=CROPDMG*CROPDMGEXP)
After transformation now is just have numbers.
datatable(as.data.frame(table(tmp$PROPDMGEXP)))
First I group by EVTYPE
StormData_group<- tmp %>%
group_by(EVTYPE)
Then I summarise and pick only 10 most most harmful events
Note: I also need to sum the columns FATALITIES and INJURIES
StormData_summarise<-StormData_group %>%
summarise(FATALITIES=sum(FATALITIES),INJURIES=sum(INJURIES)) %>%
mutate(Total=FATALITIES+INJURIES)
StormData_TOP10<-StormData_summarise[order(StormData_summarise$Total,decreasing = TRUE),][1:10,]
StormData.long<-melt(StormData_TOP10[,1:3],id.vars="EVTYPE")
The output after transformation StormData summarize
datatable(StormData_summarise)
Now the same transformation for economics events
StormData_economics<-StormData_group %>%
summarise(PROPDMG=sum(PROPDMG_T),CROPDMG=sum(CROPDMG_T))%>%
mutate(Total=PROPDMG+CROPDMG)
StormData_TOP10_economics<-StormData_economics[order(StormData_economics$Total,decreasing =TRUE),][1:10,]
StormData.economics<-melt(StormData_TOP10_economics[,1:3],id.vars="EVTYPE")
We can easily verify the most harmful events with respect to population health is Tornado. We also can verify on below images the fatalities and injuries both have the tornado as harmful event.
Related with most harmful events with respect to economic consequences the flood is most harmful. We can verify the crop is more affected by drought then flood.
p1<-ggplot(data=StormData.long, aes(x=reorder(EVTYPE,value), y=value,fill=variable)) +
geom_bar(stat="identity")+
coord_flip()+
labs(title = "")+
xlab("Event type")+
ylab("Sum ")
pp1<-ggplotly(p1)
p2<-ggplot(data=StormData.economics, aes(x=reorder(EVTYPE,value), y=value,fill=variable)) +
geom_bar(stat="identity")+
scale_fill_manual(values = c("green","black"))+
labs(fill = "Value")+
coord_flip()+
labs(title = "Most harmful events")+
xlab("Event type")+
ylab("Sum ")
pp2<-ggplotly(p2)
subplot(pp1,pp2,nrows = 2)
Note: this figures are interactive so you can play with images.